
Conversation

@whyvineet commented Jan 25, 2026

Simplify single-field dataclass ordering comparisons
#144191

@whyvineet requested a review from ericvsmith as a code owner on January 25, 2026 at 15:33
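
For context, a rough sketch of the idea (based on the implementation snippets quoted later in this thread, not the actual diff): for a dataclass with a single field, the generated ordering methods can compare the attribute directly instead of building one-element tuples.

from dataclasses import dataclass

# Illustration only (class name made up). For a single-field ordered
# dataclass, the generated comparison body amounts to
#     (self.x,) < (other.x,)    # tuple form
# versus the simplified
#     self.x < other.x          # direct form
# and the two agree for ordinary comparable fields.
@dataclass(order=True)
class OneField:
    x: int

assert (OneField(1) < OneField(2)) == ((1,) < (2,))
assert (OneField(2) < OneField(1)) == ((2,) < (1,))
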
@bedevere-app (bot) commented Jan 25, 2026

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@johnslavik (Member) left a comment

Thanks! You can wait for the green light from Eric before going forward or add tests for this even now. I'd probably change test_1_field_compare. And of course a news entry, too.
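
A minimal behavioral check in the spirit of test_1_field_compare (a sketch with a hypothetical class name, not the actual test code):

from dataclasses import dataclass

# Sketch of the behavior test_1_field_compare should keep covering after the
# change: a single-field dataclass with order=True orders by that field.
@dataclass(order=True)
class C:
    x: int

assert C(0) < C(1)
assert C(1) <= C(1)
assert C(2) > C(1)
assert not (C(1) < C(1))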

@whyvineet (Author) commented:

Hey @picnixz and @johnslavik, just a quick update: I ran a small local benchmark on a CPython 3.15 dev build to ensure this change doesn't cause any noticeable slowdown. In my tests, the match/case version was slightly faster in the single-field scenario and comparable in the multi-field case. Since this code executes only during class creation, I didn’t notice any adverse effects.

I’m happy to follow your advice on this. If you’d rather test it yourselves or revert to the if/else approach for clarity, just let me know, and I’ll update the PR accordingly.

@whyvineet (Author) commented:

Just for reference, here are the numbers I observed locally (CPython 3.15 dev build):

  • Single-field case: match/case ~8–9% faster than if len(flds) == 1
  • Multiple-field case: no meaningful difference (well under 1%)

These were consistent across repeated runs.

@picnixz (Member) commented Jan 29, 2026

Please share the benchmarking script and the way you ran it. Class creation matters when considering import time as well.
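
(On the import-time point: since the decorator runs once per class when a module is imported, one rough way to gauge it is to time defining a batch of ordered dataclasses. A sketch assuming pyperf is available; this is not the script whyvineet posts below.)

import pyperf
from dataclasses import dataclass

# Source for 100 small ordered dataclasses, compiled once so that the timed
# region covers class creation plus the dataclass(order=True) machinery,
# roughly what a module full of such classes pays at import time.
SRC = "\n".join(
    f"@dataclass(order=True)\nclass C{i}:\n    x: int\n    y: int"
    for i in range(100)
)
CODE = compile(SRC, "<generated>", "exec")

def define_many():
    # Fresh globals per call so every class is created and decorated anew.
    exec(CODE, {"dataclass": dataclass})

runner = pyperf.Runner()
runner.bench_func("define 100 ordered dataclasses", define_many)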

@whyvineet (Author) commented:

The script was executed using the freshly built interpreter. Each implementation (match/case vs if/else) is run back-to-back for a large number of iterations to reduce noise.

Script:
import timeit
import sys
from dataclasses import dataclass

class MockField:
    def __init__(self, name):
        self.name = name

def _tuple_str(obj_name, flds):
    if not flds:
        return '()'
    return f'({",".join(f"{obj_name}.{f.name}" for f in flds)},)'

def match_impl(flds):
    match flds:
        case [single_fld]:
            self_expr = f"self.{single_fld.name}"
            other_expr = f"other.{single_fld.name}"
        case _:
            self_expr = _tuple_str("self", flds)
            other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def ifelse_impl(flds):
    if len(flds) == 1:
        self_expr = f"self.{flds[0].name}"
        other_expr = f"other.{flds[0].name}"
    else:
        self_expr = _tuple_str("self", flds)
        other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def run_benchmark():
    single_field = [MockField("value")]
    multi_field = [MockField("x"), MockField("y"), MockField("z")]
    iterations = 1_000_000

    # Just to make sure that I am using the local build
    print("Python:", sys.version)
    print("Executable:", sys.executable)
    print()

    print("Single-field case")
    print("match/case:", timeit.timeit(lambda: match_impl(single_field), number=iterations))
    print("if/else:", timeit.timeit(lambda: ifelse_impl(single_field), number=iterations))

    print("Multi-field case")
    print("match/case:", timeit.timeit(lambda: match_impl(multi_field), number=iterations))
    print("if/else:", timeit.timeit(lambda: ifelse_impl(multi_field), number=iterations))

    print("Real dataclass comparison (sanity check)")
    @dataclass(order=True)
    class A:
        x: int

    @dataclass(order=True)
    class B:
        x: int
        y: int
        z: int

    a1, a2 = A(1), A(2)
    b1, b2 = B(1, 2, 3), B(2, 3, 4)

    print("single-field:", timeit.timeit(lambda: a1 < a2, number=iterations))
    print("multi-field: ", timeit.timeit(lambda: b1 < b2, number=iterations))

if __name__ == "__main__":
    run_benchmark()

@whyvineet requested a review from picnixz on February 1, 2026 at 10:26
@picnixz (Member) commented Feb 1, 2026

Can you use pyperf instead of custom benchmarks? I am also interested in the stdev.

@whyvineet (Author) commented:

@picnixz, please excuse any rough edges here... I’m still getting familiar with pyperf, so I’m happy to adjust if I’m misusing it in any way.

PS D:\dev\cpython> PCbuild\amd64\python.exe benchmark.py --rigorous
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclasses single-field match/case: Mean +- std dev: 403 ns +- 37 ns
.........................................
WARNING: the benchmark result may be unstable
* the standard deviation (49.3 ns) is 11% of the mean (454 ns)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclasses single-field if/else: Mean +- std dev: 454 ns +- 49 ns
Script:
import pyperf

class MockField:
    def __init__(self, name):
        self.name = name

def _tuple_str(obj_name, flds):
    if not flds:
        return '()'
    return f'({",".join(f"{obj_name}.{f.name}" for f in flds)},)'

def match_impl(flds):
    match flds:
        case [single_fld]:
            self_expr = f"self.{single_fld.name}"
            other_expr = f"other.{single_fld.name}"
        case _:
            self_expr = _tuple_str("self", flds)
            other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

def ifelse_impl(flds):
    if len(flds) == 1:
        self_expr = f"self.{flds[0].name}"
        other_expr = f"other.{flds[0].name}"
    else:
        self_expr = _tuple_str("self", flds)
        other_expr = _tuple_str("other", flds)
    return self_expr, other_expr

runner = pyperf.Runner()
single_field = [MockField("value")]

runner.bench_func(
    "dataclasses single-field match/case",
    match_impl,
    single_field
)

runner.bench_func(
    "dataclasses single-field if/else",
    ifelse_impl,
    single_field
)

@picnixz (Member) commented Feb 1, 2026

I would like the results with the dataclass usage. And with more than just 3 fields. Is the interpreter built with PGO and LTO? Or is it a debug build?

@whyvineet (Author) commented:

@picnixz, here are the results with dataclass usage (I'm using a debug build):

match/case results:
PS D:\dev\cpython> PCbuild\amd64\python.exe benchmark.py --rigorous
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (1 field): Mean +- std dev: 530 us +- 47 us
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (5 fields): Mean +- std dev: 876 us +- 71 us
.........................................
WARNING: the benchmark result may be unstable
* the standard deviation (178 us) is 14% of the mean (1.28 ms)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (10 fields): Mean +- std dev: 1.28 ms +- 0.18 ms
if/else results:
PS D:\dev\cpython> PCbuild\amd64\python.exe benchmark.py --rigorous
.........................................
WARNING: the benchmark result may be unstable
* the standard deviation (61.7 us) is 12% of the mean (521 us)
* the maximum (813 us) is 56% greater than the mean (521 us)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (1 field): Mean +- std dev: 521 us +- 62 us
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (5 fields): Mean +- std dev: 835 us +- 57 us
.........................................
WARNING: the benchmark result may be unstable
* Not enough samples to get a stable result (95% certainly of less than 1% variation)

Try to rerun the benchmark with more runs, values and/or loops.
Run 'python.exe -m pyperf system tune' command to reduce the system jitter.
Use pyperf stats, pyperf dump and pyperf hist to analyze results.
Use --quiet option to hide these warnings.

dataclass creation (10 fields): Mean +- std dev: 1.14 ms +- 0.07 ms
Script:
import pyperf
from dataclasses import dataclass

def make_dataclass(n):
    namespace = {f"x{i}": int for i in range(n)}
    return dataclass(order=True)(
        type(f"C{n}", (), {"__annotations__": namespace})
    )

runner = pyperf.Runner()

runner.bench_func("dataclass creation (1 field)", make_dataclass, 1)
runner.bench_func("dataclass creation (5 fields)", make_dataclass, 5)
runner.bench_func("dataclass creation (10 fields)", make_dataclass, 10)

@picnixz (Member) commented Feb 2, 2026

Results on DEBUG builds are not relevant. Please use a PGO/LTO build.

@picnixz (Member) commented Feb 2, 2026

In addition, can you use pyperf compare as well? And, if possible, remove the time needed for creating the type; we are only interested in the time needed by the decorator.
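
(One possible shape for such a decorator-only benchmark; a sketch, since the script behind the compare_to numbers below was not posted in the thread. pyperf's bench_time_func lets the setup happen outside the timed region; a fresh bare class is still needed per iteration because order=True refuses to overwrite an existing __lt__ on a class that has already been decorated.)

import time
import pyperf
from dataclasses import dataclass

def time_decorator_only(loops, n):
    # Build the bare classes up front so that only dataclass(order=True)(cls)
    # sits inside the timed region.
    annotations = {f"x{i}": int for i in range(n)}
    classes = [type(f"C{n}", (), {"__annotations__": dict(annotations)})
               for _ in range(loops)]
    decorate = dataclass(order=True)

    t0 = time.perf_counter()
    for cls in classes:
        decorate(cls)
    return time.perf_counter() - t0

runner = pyperf.Runner()
for n in (1, 5, 10):
    runner.bench_time_func(f"dataclass decorator ({n} fields)", time_decorator_only, n)

Saving each build's results with -o (e.g. -o ifelse_version.json and -o matchcase_version.json) and then running python -m pyperf compare_to ifelse_version.json matchcase_version.json gives the kind of comparison requested above.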

@whyvineet (Author) commented:

@picnixz, I ran pyperf compare_to ifelse_version.json matchcase_version.json (timing only the @dataclass(order=True) decorator) on a PGO/LTO build, and here are the results:

Benchmark hidden because not significant (3): dataclass decorator (1 field), dataclass decorator (5 fields), dataclass decorator (10 fields)

Geometric mean: 1.01x slower
